Beating SGD: Learning SVMs in Sublinear Time

Neural Information Processing Systems

We present an optimization approach for linear SVMs based on a stochastic primal-dual approach, where the primal step is akin to an importance-weighted SGD, and the dual step is a stochastic update on the importance weights. This yields an optimization method with a sublinear dependence on the training set size, and the first method for learning linear SVMs with runtime less than the size of the training set required for learning!

Residual Quantization with Implicit Neural Codebooks

Huijben, Iris, Douze, Matthijs, Muckley, Matthew, van Sloun, Ruud, Verbeek, Jakob

arXiv.org Artificial Intelligence

Vector quantization is a fundamental operation for data compression and vector search. To obtain high accuracy, multi-codebook methods increase the rate by representing each vector using codewords across multiple codebooks. Residual quantization (RQ) is one such method, which increases accuracy by iteratively quantizing the error of the previous step. The error distribution is dependent on previously selected codewords. This dependency is, however, not accounted for in conventional RQ, as it uses a generic codebook per quantization step. In this paper, we propose QINCo, a neural RQ variant which predicts specialized codebooks per vector using a neural network that is conditioned on the approximation of the vector from previous steps. Experiments show that QINCo outperforms state-of-the-art methods by a large margin on several datasets and code sizes. For example, QINCo achieves better nearest-neighbor search accuracy using 12-byte codes than other methods using 16-byte codes on the BigANN and Deep1B datasets.
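To make the baseline concrete, here is a minimal sketch of conventional RQ, the method QINCo improves on: each step fits one generic codebook to the residuals of the previous step. The neural, per-vector codebook prediction that defines QINCo is not reproduced here; the k-means-style training loop and the parameter values are illustrative assumptions.

```python
import numpy as np

def train_rq(X, num_steps=4, k=16, seed=0):
    """Conventional residual quantization: fit one generic codebook
    per step (random init + a few Lloyd iterations), then subtract
    the assigned codeword to form the next step's residuals."""
    rng = np.random.default_rng(seed)
    residual = X.copy()
    codebooks = []
    for _ in range(num_steps):
        C = residual[rng.choice(len(residual), k, replace=False)]
        for _ in range(10):  # a few Lloyd iterations on the residuals
            d2 = ((residual[:, None, :] - C[None]) ** 2).sum(-1)
            assign = d2.argmin(1)
            for j in range(k):
                pts = residual[assign == j]
                if len(pts):
                    C[j] = pts.mean(0)
        codebooks.append(C.copy())
        d2 = ((residual[:, None, :] - C[None]) ** 2).sum(-1)
        residual = residual - C[d2.argmin(1)]
    return codebooks

def encode(x, codebooks):
    """Greedy RQ encoding: at each step pick the codeword closest
    to the current residual and subtract it."""
    codes, r = [], x.copy()
    for C in codebooks:
        j = int(((C - r) ** 2).sum(1).argmin())
        codes.append(j)
        r = r - C[j]
    return codes
```

The reconstruction is simply the sum of the selected codewords, one per step; QINCo replaces the fixed `C` at each step with a network-predicted codebook conditioned on the partial reconstruction.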


Kernel Regression and Backpropagation Training With Noise

Neural Information Processing Systems

One method proposed for improving the generalization capability of a feedforward network trained with the backpropagation algorithm is to use artificial training vectors which are obtained by adding noise to the original training vectors. We discuss the connection of such backpropagation training with noise to kernel density and kernel regression estimation. We compare by simulated examples (1) backpropagation, (2) backpropagation with noise, and (3) kernel regression in mapping estimation and pattern classification contexts.
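The artificial-training-vector idea is simple enough to sketch directly. In the kernel-regression view the paper discusses, the noise width plays the role of a kernel bandwidth; the function name and parameter values below are illustrative.

```python
import numpy as np

def augment_with_noise(X, y, copies=5, sigma=0.1, seed=0):
    """Generate artificial training vectors by adding Gaussian
    noise to the originals; labels are copied unchanged. sigma
    acts like a kernel bandwidth in the kernel-regression view."""
    rng = np.random.default_rng(seed)
    X_aug = np.concatenate([X + rng.normal(scale=sigma, size=X.shape)
                            for _ in range(copies)])
    y_aug = np.tile(y, copies)
    return X_aug, y_aug
```

Training backpropagation on `(X_aug, y_aug)` instead of `(X, y)` is the "training with noise" scheme the abstract compares against plain backpropagation and kernel regression.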


The Quantum Version Of Classification Decision Tree Constructing Algorithm C5.0

Khadiev, Kamil, Mannapov, Ilnaz, Safina, Liliya

arXiv.org Machine Learning

In the paper, we focus on the complexity of the C5.0 algorithm for constructing a decision tree classifier, a model for the classification problem in machine learning. In the classical case, the decision tree is constructed in $O(hd(NM+N \log N))$ running time, where $M$ is the number of classes, $N$ is the size of the training data set, $d$ is the number of attributes of each element, and $h$ is the tree height. Firstly, we improve the classical version; the running time of the new version is $O(h\cdot d\cdot N\log N)$. Secondly, we suggest a quantum version of this algorithm, which uses quantum subroutines like amplitude amplification and the D{\"u}rr-H{\o}yer minimum search algorithm, which are based on Grover's algorithm. The running time of the quantum algorithm is $O\big(h\cdot \sqrt{d}\log d \cdot N \log N\big)$, which is better than the complexity of the classical algorithm.
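The $N \log N$ term comes from the per-attribute work at each node: sorting one attribute's values and scanning candidate thresholds for the best information gain. A minimal sketch of that classical step (the quantum version accelerates the search over the $d$ attributes, not this inner scan) follows; the helper names are illustrative.

```python
import math

def best_threshold(values, labels):
    """Sort one numeric attribute (O(N log N)), then scan split
    points and return (best information gain, best threshold)."""
    def entropy(counts):
        n = sum(counts.values())
        return -sum(c / n * math.log2(c / n) for c in counts.values() if c)

    pairs = sorted(zip(values, labels))      # the O(N log N) sort
    total = {}
    for _, l in pairs:
        total[l] = total.get(l, 0) + 1
    base, n = entropy(total), len(pairs)
    left, best = {}, (0.0, None)
    for i, (v, l) in enumerate(pairs[:-1], 1):
        left[l] = left.get(l, 0) + 1
        right = {k: total[k] - left.get(k, 0) for k in total}
        right = {k: c for k, c in right.items() if c}
        gain = base - (i / n) * entropy(left) - ((n - i) / n) * entropy(right)
        if gain > best[0]:
            best = (gain, (v + pairs[i][0]) / 2)
    return best
```

C5.0 repeats this for each of the $d$ attributes at each of up to $h$ tree levels, which is where the $h \cdot d \cdot N \log N$ classical cost comes from.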


Feature uncertainty bounding schemes for large robust nonlinear SVM classifiers

Couellan, Nicolas, Jan, Sophie

arXiv.org Machine Learning

We consider the binary classification problem when data are large and subject to unknown but bounded uncertainties. We address the problem by formulating the nonlinear support vector machine training problem with robust optimization. To do so, we analyze and propose two bounding schemes for uncertainties associated with random approximate features in low dimensional spaces. The proposed techniques are based on Random Fourier Features and the Nystr\"om method. The resulting formulations can be solved with efficient stochastic approximation techniques such as stochastic (sub)-gradient, stochastic proximal gradient techniques, or their variants.
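For readers unfamiliar with the first of the two feature approximations, here is a minimal sketch of Random Fourier Features for the Gaussian (RBF) kernel: $z(x)^\top z(y) \approx \exp(-\gamma \|x-y\|^2)$ with $W$ rows drawn from $\mathcal{N}(0, 2\gamma I)$ and phases from $[0, 2\pi]$. This is the standard construction, not the paper's robust bounding scheme; parameter names are illustrative.

```python
import numpy as np

def rff_features(X, D=1000, gamma=1.0, seed=0):
    """Map each row x of X to z(x) = sqrt(2/D) * cos(W x + b),
    so that z(x)·z(y) approximates the RBF kernel
    exp(-gamma * ||x - y||^2)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(D, d))
    b = rng.uniform(0, 2 * np.pi, size=D)
    return np.sqrt(2.0 / D) * np.cos(X @ W.T + b)
```

Training a linear SVM (robust or not) on `rff_features(X)` approximates training a kernel SVM on `X`, which is what makes the large-scale stochastic solvers mentioned above applicable.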


Quantum Perceptron Models

Kapoor, Ashish, Wiebe, Nathan, Svore, Krysta

Neural Information Processing Systems

We demonstrate how quantum computation can provide non-trivial improvements in the computational and statistical complexity of the perceptron model. We develop two quantum algorithms for perceptron learning. The first algorithm exploits quantum information processing to determine a separating hyperplane using a number of steps sublinear in the number of data points $N$, namely $O(\sqrt{N})$. The second algorithm illustrates how the classical mistake bound of $O(\frac{1}{\gamma^2})$ can be further improved to $O(\frac{1}{\sqrt{\gamma}})$ through quantum means, where $\gamma$ denotes the margin. Such improvements are achieved through the application of quantum amplitude amplification to the version space interpretation of the perceptron model.
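The classical baseline being improved is the standard online perceptron, whose mistake bound is the $O(1/\gamma^2)$ quantity cited above. A minimal sketch follows; note that each pass touches all $N$ points, which is exactly the linear cost the $O(\sqrt{N})$ quantum search avoids.

```python
import numpy as np

def perceptron(X, y, max_epochs=100):
    """Classical online perceptron: on each mistake (a point with
    non-positive margin), add y_i * x_i to the weights. On linearly
    separable data with margin gamma, the number of mistakes is
    bounded by O(1/gamma^2)."""
    w = np.zeros(X.shape[1])
    mistakes = 0
    for _ in range(max_epochs):
        updated = False
        for xi, yi in zip(X, y):
            if yi * (xi @ w) <= 0:
                w += yi * xi
                mistakes += 1
                updated = True
        if not updated:          # a full clean pass: converged
            break
    return w, mistakes
```

The quantum algorithms keep this mistake-driven structure but use amplitude amplification to locate a misclassified point (or to search the version space) faster than a linear scan.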


Leveraging over intact priors for boosting control and dexterity of prosthetic hands by amputees

Gregori, Valentina, Caputo, Barbara

arXiv.org Machine Learning

This becomes even more problematic with highly articulated modern prostheses. The natural use of these devices is challenging in everyday life, primarily due to the software [2, 3, 4]. The open question is how to reduce this training time while making control as natural as possible. Machine learning has opened a new path to tackle this problem by allowing the prosthesis to adapt to the myoelectric signals of a specific user. Although these methods have been applied with success (e.g., [5] and references therein), they still require a significant amount of data from individual subjects to learn models with satisfactory performance. Consider a situation in which different subjects repeat the same hand postures, and suppose that a new target user attempts to learn the same movements. In this case, it should be beneficial to reuse the information from the prior subjects and thereby reduce the training data required from the new subject. However, even if a movement appears the same for all subjects, the distribution of their myoelectric signals is very different.


LogisticRegression - mlxtend

#artificialintelligence

Related to the Perceptron and Adaline, a Logistic Regression model is a linear model for binary classification. However, instead of minimizing a linear cost function such as the sum of squared errors (SSE) in Adaline, we minimize a sigmoid function, i.e., the logistic function. Here, $p(y = 1 \mid \mathbf{x})$ is the conditional probability that a particular sample belongs to class 1 given its features $\mathbf{x}$. The logit function takes inputs in the range [0, 1] and transforms them to values over the entire real number range. In contrast, the logistic function takes input values over the entire real number range and transforms them to values in the range [0, 1]. In other words, the logistic function is the inverse of the logit function, and it lets us predict the conditional probability that a certain sample belongs to class 1 (or class 0).
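The logit/logistic inverse relationship described above is easy to verify numerically:

```python
import math

def logistic(z):
    """Logistic (sigmoid): maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Logit (log-odds): maps p in (0, 1) onto the whole real
    line; it is the inverse of the logistic function."""
    return math.log(p / (1.0 - p))
```

For any real `z`, `logit(logistic(z))` recovers `z`, and `logistic(0)` gives the decision-boundary probability 0.5.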